On the Monte Carlo weights in multiple criteria decision analysis

In multiple-criteria decision making/aiding/analysis (MCDM/MCDA) weights of criteria constitute a crucial input for finding an optimal solution (alternative). A large number of methods were proposed for criteria weights derivation including direct ranking, point allocation, pairwise comparisons, entropy method, standard deviation method, and so on. However, the problem of correct criteria weights setting persists, especially when the number of criteria is relatively high. The aim of this paper is to approach the problem of determining criteria weights from a different perspective: we examine what weights’ values have to be for a given alternative to be ranked the best. We consider a space of all feasible weights from which a large number of weights in the form of n−tuples is drawn randomly via Monte Carlo method. Then, we use predefined dominance relations for comparison and ranking of alternatives, which are based on the set of generated cases. Further on, we provide the estimates for a sample size so the results could be considered robust enough. At last, but not least, we introduce the concept of central weights and the measure of its robustness (stability) as well as the concept of alternatives’ multi-dominance, and show their application to a real-world problem of the selection of the best wind turbine.


Introduction
Multiple criteria decision making/aiding/analysis (MCDM/MCDA) methods represent one of the most successful tools for sophisticated decision making in the framework of complex realworld problems usually involving many alternatives which should be compared and ranked under a set of suitable criteria. A state-of-the-art on MCDM/MCDA methods can be found e.g. in [1][2][3] or [4]. One of the main challenges associated with MCDM/MCDA that attracted a broad range of studies is the problem of appropriate (objective) derivation of criteria weights, see e.g. [5][6][7][8][9][10][11][12][13][14][15], or [16]. According to [9], the methods for the derivation of criteria weights fall into three categories-subjective weighting methods, objective weighting methods and hybrid weighting methods. Subjective methods depend on the preferences of decision-makers and include direct ranking, point allocation, pairwise comparisons, or SMART (Simple Multi-Attribute Ranking Technique). The main disadvantage of these methods is that they are recourse-consuming when the number of criteria increases. Objective weighting methods utilize specific computational process based on the initial data or decision-matrix, and are not based on experts' preferences or judgements; entropy method, CRITIC (CRiteria Importance Through Inter-criteria Correlation) or SECA (Simultaneous Evaluation of Criteria and Alternatives) belong in this category. The hybrid weighting methods, such as MEREC (MEthod based on the Removal Effects of Criteria), combine both approaches. In general, MCDM/MCDA methods suffer from two drawbacks related to criteria weights. The first is that it is almost impossible to set weights of criteria precisely (perfectly) so that the optimal solution is obtained indeed. This occurs, in particular, when facing a new challenge or a novel and a decision maker is lacking corresponding knowledge and/or experience, typically when cutting edge science or technology is involved. However, even relatively simple tasks constitute a problem. For instance, QS university ranking, which aspires to provide an ordered list of top 1,000 universities in the World as close to the reality as possible, applies the following criteria: academic reputation (40%), employer reputation (10%), faculty/student ratio (20%), citations per faculty (20%), international student and faculty ratio (both 5%). However, how do we know that these exact criteria weights provide a truly objective and unbiased ranking of universities?
The second drawback relates to the well-known high sensitivity of criteria weights with respect to the final evaluation of alternatives. In some cases even the slightest change in criteria weights may lead to a diametrically different ranking of alternatives and/or the change of the best one, see e.g. [17], and this is the case when natural uncertainties in the evaluation of criteria or alternatives arise. NASA belongs among institutions famous for their meticulous approach to the space exploration and problem solving in general. Their NASA Systems Engineering Handbook in the part 6.8 Decision Analysis states [18]: Once the decision alternative evaluation is completed, recommendations should be brought back to the decision maker including an assessment of the robustness of the ranking (i.e., whether the uncertainties are such that reducing them could credibly change the ranking of the alternatives). Generally, a single alternative should be recommended. However, if the alternatives do not significantly differ, or if uncertainty reduction could credibly alter the ranking, the recommendation should include all closely ranked alternatives for a final selection by the decision-maker.
This paragraph clearly acknowledges uncertainty in the evaluation of alternatives and stresses the importance of a robustness analysis.
In this paper we turn over the usual perspective on MCDM/MCDA problems. Instead of asking which alternative is the best under given criteria weights we ask what the values of criteria weights have to be so that given alternative is ranked first. Further on, we ask what smallest change in criteria weights leads to the change of the best alternative and whether there are feasible weights such that any alternative could be ranked first at all.
The aim of our study is to answer these questions by proposing a novel Monte Carlo weights' approach. We consider the set of all feasible weights (a subspace of an n-dimensional space, where n is the number of criteria) from which a large number of weights (see Section 4 for estimates of the sample size with respect to the relative standard error of the mean) in the form of n-tuples is drawn randomly from a uniform probability distribution via Monte Carlo method. Then, we apply predefined dominance relations for comparison and ranking of alternatives, and we provide an analysis of the sensitivity (robustness) of the aforementioned solutions by introducing a concept of the so called central weights and their radius. Moreover, we show that although an alternative can be non-dominated when compared with a single other alternative, a group domination (called multi-domination) may appear: an alternative might be dominated by a subset of other alternatives for all feasible weights. The identification of multi-dominated alternatives can significantly reduce the number of alternatives under consideration as demonstrated in the application part of the paper. At last, but not least, our approach accentuates the problem of uncertainty mentioned in the NASA Systems Engineering Handbook by modelling the values of criteria weights, thus allowing examination of their influence on alternatives' final rankings.
The organization of paper is as follows: in Section 2 we provide a brief introduction to the Monte Carlo method, Monte Carlo weights and dominance relations along with an illustrative numerical example. In Section 3 we demonstrate the application of our approach to the analysis of a concrete real-world problem, namely a selection of the best wind turbine. Discussion (Section 4) and Conclusions (Section 5) close the article.

The Monte Carlo method
In general, the term Monte Carlo method refers to a broad variety of algorithms that obtain numerical results via (many times) repeated random sampling from a given probability distribution. See e.g. [19][20][21][22] or [23] for an introduction to the Monte Carlo method.
History of Monte Carlo method dates back to the Buffon's needle problem for the derivation of the value of π from the 18 th century. The modern version of the Monte Carlo method was pioneered during the World War II by Stanislaw Ulam and John von Neumann [22,24]. Since then, Monte Carlo method was successfully applied in physics (see e.g. McKean-Vlasov processes), mathematics (complex multidimensional definite integrals), economics (Markov chains, risk), engineering (oil extraction), biology (study of genomes and proteins), medicine (radiotherapy), sports, or operations research (optimization problems), see e.g. [25][26][27][28], or [29]. In the context of pairwise comparisons, Monte Carlo studies were applied for instance in [30][31][32][33][34][35][36][37][38], or [39]. Nevertheless, a comprehensive review of Monte Carlo applications is beyond the scope of this study. In [28] Alex Bielajew states that up to year 2011, more than 300,000 papers were published on the Monte Carlo method, with 10% of papers related to medicine only.
Currently, the Monte Carlo method constitutes a popular modelling method in a wide areas of human action supported by many software products such as GoldSim, NIST, or B-RISK, and Monte Carlo simulation modules are also included in MS Excel as XLSTAT, in the statistical software SPSS, in MATLAB, or in the programming language R.
Usually, the Monte Carlo methods follow the following steps: 1) the domain of sampling and probability distribution are defined, 2) a large number of random draws (with repeating) is performed, 3) results are aggregated, analysed and interpreted.
The application of Monte Carlo method requires a random (unbiased) sampling from a given probability distribution. In practice, pseudo-random sequences are generated by a class of algorithms called pseudorandom number generator (PRNG), or a deterministic random bit generator (DRBG), see e.g. [40]. For example, MS Excel uses the Mersenne Twister algorithm (MT19937). Pseudo-random sequences are easy to test and re-run. The only quality usually necessary to make a good simulation is that a pseudo-random sequence is 'random enough'.
The second crucial feature of the Monte Carlo simulations is their error. Each time a Monte Carlo simulation is performed, slightly different results (mean values) are obtained. The variability of results (i.e., how much the mean estimate varies from one Monte Carlo simulation to another Monte Carlo simulation) depends on the number N of trials in each Monte Carlo simulation.
Let x i , i 2 {1, . . ., N} denote the individual randomly generated values, let N be the sample size, let � x denote the mean value of the sample and let s 2 x be its variance. When Monte Carlo simulations are repeated, the mean values � x will slightly differ (variances are assumed to be identical). The variance of the mean s� x 2 is then given as follows [41]: Thus, the standard error (deviation) of the mean s � x decreases with the square root of the sample size N in each Monte Carlo simulation. This relation does not depend on the underlying probability distribution.
In our approach, the criteria weights are randomly drawn from the uniform probability distribution. For the uniform probability distribution, where x i 2 [a, b], the variance s 2 x is given as follows: The relations above enable estimation of the sample size N so that a standard error is under a desired threshold, see Section 4 for more details.

Monte Carlo weights and dominance relations
. ., C n } be the set of n criteria and let w = (w 1 , w 2 , . . ., w n ) be the vector of criteria weights such that Since in our approach the weights of criteria are randomly generated from the interval [a, b [ by the Monte Carlo method, we will denote these weights as Monte Carlo weights (MC weights in short). For practical purposes the number of generated cases of these weights is recommended to be at least in thousands, see e.g. [42,43], or [44], but see Section 4 for more details.
Further on, let's assume that all alternatives are evaluated with respect to all criteria, and f ij denotes the evaluation of the i-th alternative under j-th criterion, where f ij 2 R. The matrix F = (f ij ) is called the decision matrix. Further on, let U(A i ) be a (cardinal) utility function of an alternative i: Next, we propose the following dominance relations for alternatives' comparison and ranking.

Definition 1 Let N be the number of cases of Monte Carlo weights
From a multiplicative PC matrix alternatives' weights (also called a priority vector) can be easily derived by the eigenvalue method or the geometric mean method, see e.g. [45,46], or [39].
Further on, we define central weights (the most stable weights) for each alternative as follows: Definition 4 Let f ij be the evaluation of alternative i with respect to criterion j. Let w = (w 1 , . . ., w n ) be a vector of MC weights of all criteria. Let U (i) denote a utility function of alternative i. Let W (i) = {w|U i � U j , 8j} be a "space" of weights for which alternative i is the best (attains the maximum value of a utility function). Further on, let w ðiÞ � 2 W ðiÞ denote weights for which two conditions are satisfied: i) There exists a neighbourhood in the form of an open "ball" N � W (i) such that w ðiÞ � is its centre, and r > 0 is its radius.
ii) The radius r is maximal. Then the w ðiÞ � is called the central weights w.r.t. alternative i. Obviously, the greater is the value of r from Definition 4, the greater is the necessary change in weights from central weights to replace the best alternative i with another best alternative. In this sense, r expresses stability or robustness of the central weights. Also, it should be mentioned that the previous definition utilizes the notion of a distance (between weights), hence a suitable metric function must be selected in practice. Therein after, it is assumed that the Manhattan metric is such a suitable metric, see also [47].
The best or optimal alternative in MCDM/MCDA problems always belongs to the set of non-dominated alternatives. This means that given the set of alternatives A = {A 1 , . . ., A k } and the set of criteria C = {C 1 , . . ., C n }, alternative A i dominates alternative A j (we write A i � A j ) if for all j = {1, . . ., n} it holds that A i is evaluated better or equally as A j , but at least one preference is strict.
Next, we provide a generalization of the concept of dominance.
In other words, if an alternative j is dominated by a set of alternatives according to Definition 5, it is never ranked as the best one. We recommend to set the lower bound N 0 of the number of randomly generated MC weights to 10,000 in accordance with [42,43], or [44].
While the case of one alternative dominance over another alternative can be called singledominance (s-dominance in short), the case of the dominance of a set over one alternative can be referred to as multi-dominance (m-dominance in short). It should be noted that while sdominance implies m-dominance, the inverse is not true in general.
To summarize, the proposed Monte Carlo weights method for multiple criteria decision analysis proceeds in the following steps (that slightly differ with regard to the dominance relation involved): 1) The sets of alternatives A and criteria C along with the decision matrix F = (f ij ) form the method's input. Also, a probability distribution of random draws of weights is set (usually it is uniform distribution).
2) A large number of criteria weights is generated randomly via Monte Carlo method such that each weight w i is drawn independently from (the same) open interval ]a, b[. The dominance relation is selected.
3i) For the dominance relation from Definition 1: For each generated case of the MC weights the best alternative (the alternative with the highest value of a utility function) is found.
4i) Results are aggregated over all generated cases and the values of B i are found. 5i) All alternatives are ranked via the dominance relation from Definition 1 from the best to the worst.
3ii) For the dominance relation from Definition 2: For each generated case of the MC weights all alternatives are pairwise compared with respect to their utility function.
4ii) Results are aggregated over all generated cases and the values of D ij are found. 5ii) All alternatives are ranked via the dominance relation from Definition 2 from the best to the worst. 3iii) For the dominance relation from Definition 3: For each generated case of the MC weights the value of a utility function is calculated for each alternative.
4iii) Results are aggregated over all generated cases and the values of U i mean are found. 5iii) All alternatives are ranked via the dominance relation from Definition 3 from the best to the worst. 6) At this final step central weights w� and radius r are estimated for each alternative. Fig 1 shows a simplified flow chart of the method. During the procedures above a decision maker may identify an alternative that is multi-dominated, i.e. never best and thus irrelevant. In such a case it is recommended to remove this alternative from further consideration.

An illustrative numerical example
Let's consider 5 alternatives {A 1 , A 2 , A 3 , A 4 , A 5 }, and 3 criteria {C 1 , C 2 , C 3 }. All alternatives are evaluated on the scale from 1 (the worst) to 10 (the best), see Table 1. Weights of all three criteria are unknown. The goal is to find the best alternative.
To solve the problem we use the Monte Carlo weights method with 10,000 randomly generated criteria weights. For each generated case, the dominance relations from Definitions 1-3 were applied and the results are presented in Tables 2-5. We use the same number of generated cases (a sample size) through the paper purely for practical reasons-we built our Monte Carlo simulation tool with 10,000 cases. This simple size is usually more than sufficient and provides robust results, however, a researcher may adjust the sample size with respect to the desired accuracy, see Discussion (Section 4) for more details.
As can be seen, the best alternative (a Condorcet winner) is A 4 followed by A 1 . Figs 2 and 3 illustrate a 'space' of weights for which a given alternative is ranked best. Rankings of all alternatives with respect to dominance relations from Definitions 1-3 are provided in Table 6. The weights of all alternatives from the PC matrix in Table 5    A natural question regarding this or other problems associated with the Monte Carlo method arises: how many cases should be randomly generated so that the result is robust enough. We provide an answer to this question in Section 4. Here, we show the convergence of U mean (A 1 ) with the growing number N of generated cases, see

Application of Monte Carlo weights to the wind turbine selection
In the study of Rehman and Khan [48], the task of finding the best wind turbine for a wind power plant was performed. The authors gathered data about 18 wind turbines and evaluated their properties with regard to five criteria: hub height (C 1 ), rotor diameter (C 2 ), cut-in speed of wind (C 3 ), rated speed of wind (C 4 ) and rated power (C 5 ). Every criterion had the weight equal to 0.20. The data were normalized, criteria C 1 − C 4 were minimization ones, criterion C 5 was originally a maximization one, so it was transformed into a minimization one by taking its inverse. After the transformation all criteria were minimization ones and the best turbine was the turbine with the lowest weighted sum-Fuhrlander FL 600, see Table 7.
What escaped notice of the authors of the study is that Suzlon S.52 and Suzlon S.88 turbines were dominated by other alternatives, so they could be safely removed from further consideration.
As can be seen from Table 7, differences between turbines' final scores were rather small. Therefore, it could be expected that even a small change in criteria weights might lead to a different best wind turbine. Indeed, it suffices to change the weight of the criterion C 2 to 0.195 and the weight of the criterion C 5 to 0.205 (leaving the rest of criteria weights at 0.20), and the To analyse the dependence of the best turbine on criteria weights, we applied the Monte Carlo weights method. We generated 10,000 random cases of criteria weights from the interval ]0, 1[ and applied the dominance relations from Definition 1 and 2 to compare and rank all turbines (except for the two dominated ones) and to find their respective central weights-this dataset can be found on https://doi.org/10.6084/m9.figshare.19525087.v1. The results are summarized in Tables 8 and 9.
The turbine that was the most frequently best (in 36.2% of generated cases) was Fuhrlander FL 600 in accord with the result of the original study. However, our approach provided new valuable insights into the problem. Firstly, it can be seen from the Table 7 that another 8 turbines were m-dominated and would never be ranked the best (with both Suzlon turbines mentioned above 10 turbines altogether attained 0% cases of being evaluated as the best). From the remaining 8 turbines only four turbines were ranked best in at least 10% of cases: Fuhrlander FL 600, Windflow 500, AAER A-2000-84 and Ecotecnia 80/2000. Only these four turbines could deserve a more detailed consideration. Therefore, the Monte Carlo approach allowed significant reduction of the candidates for the best solution. But the advantages of the method do not stop here.
From the central weights for each of the four best alternatives, we see what weights would favour one alternative over the others. Windflow 500 turbine would be considered the best if criteria C 1 and C 2 were the most important. AAER A-2000-84 turbine would be best if criteria C 1 and C 5 were the most important, and finally Ecotecnia 80/2000 would be ranked first if the last criterion C 5 was the most important and the first criterion C 1 was the least important. After this analysis a decision maker may weigh in which configuration of criteria weights is most desirable, and then select the best option.

Discussion
The Monte Carlo weights method for the solution of MCDM/MCDA problems involving a utility function has several advantages, namely: • The method does not need precise values of criteria weights in advance since it models a large number of feasible weights so that the decision maker receives information of how the weights influence the results.
• The method enables the evaluation and ranking of alternatives despite the unknown criteria weights.
• The method enables to find the set of weights for which a given alternative is the best. Then it is up to the decision maker to decide which weights are acceptable and which are not.
• The method enables the evaluation of a stability of the so called central weights. The concept of central weights enables the decision maker to see what weights would be necessary for each alternative to be the best one.
• The method enables, as shown in the example on the wind turbine selection, to find multidominated alternatives, which are not so obvious as their single-dominated counterparts, thus reducing the number of alternatives under consideration.
On the other hand, the Monte Carlo weights approach has its limitations. Firstly, in some real-world problems criteria weights are set a priori at given values and the analysis of what would happen if they change is irrelevant. Secondly, Monte Carlo method is both computationally costly and time demanding, and might not be useful in situations when a fast solution is needed. Other limitation constitutes the fact that we introduce Monte Carlo weights for the problems where the final aggregation of alternatives' evaluations is performed via a utility function, but many MCDM/MCDA theoretical frameworks do not incorporate a utility function. However, we believe the Monte Carlo weights can be introduced into other frameworks associated with criteria weights as well, and our future research will focus in this direction.
As mentioned in Section 2, it is useful to know the size of the randomly generated sample necessary for results to be robust enough. Hereinafter, we provide this estimate.
First we assume the criteria weights are randomly drawn from a uniform distribution on the interval ]a, b[. Let x i denote randomly generated values of weights of a given criterion (does not matter which one, since they are treated equally), let � x ¼ ðb À aÞ=2 denote the mean weight of a given criterion (this value is the same for all criteria), let s 2 x ¼ ðbÀ aÞ 2 12 be the sample variance of x i and let s� x 2 be the standard error (variance) of the mean. From relations (1) and (2) it follows that : Further on, let s � x � x ¼ p, where p is the coefficient of variation of the mean (also called the standard error of the mean or the relative standard error). Now, let's estimate the sample size N corresponding to the relative standard error p: The relation (6) provides the relationship between the sample size N and the relative standard error p (given as a decimal number) of a given generated weight. The smaller is p, the more 'fairly' are the weights generated (no weight is, on average, higher than some other weight), however, the price is a large sample N.
For reader's convenience, we provide the sample sizes N for different values of p in the following Table 10.
It should be noted that the sample sizes N provided by relation (6) and shown in Table 10 are only estimates since a sample variance of x i is used instead of (unknown) population variance, and the assumption of (totally) random draws might not be fulfilled in practice due to the application of pseudo-random generators mentioned in Section 2.1.
An estimate of the sample size with respect to the relative standard error of a utility function U can be derived as well. Assume the utility function from relation (3) and recall the following formula for the variance of a linear combination of two uncorrelated (independent) variables (x, y): Now, let's estimate the relative standard error of the mean of the utility function of an alternative j (we assume f ij � 0). The variance of U j is given as: Therefore, the variance of the mean of U j is given as follows: And the relative standard error p is given as:  Finally, from Eq (10) we easily derive N: ð11Þ Since for f ij � 0 the following inequality holds: We get that the following estimate: The sample size estimate for the relative standard error of the mean of a utility function is thus lower than the sample size estimate for the relative standard error of the mean of a given weight.
Another interesting problem is whether it is possible to obtain a set of weights W for which a given alternative is the best (has the highest value of the utility function U), see Definition 1, analytically, without simulations. Assume that there are at least two alternatives and that weights of criteria w i 2 ]0, 1[. This task means to solve a system of linear inequalities where the number of inequalities equals the number of alternatives (k) minus one plus 2n 'structural inequalities' (0 < w i < 1), and the number of variables is equal to the number of criteria (n). Suppose that we want to find the set W 1 for Alternative 1. Then, the system of inequalities is given as follows: . . . f 11 w 1 þ f 12 w 2 þ � � � þ f 1n w n � f k1 w 1 þ f k2 w 2 þ . . . þ f kn w n 0 < w i < 1; 8i: This set of inequalities can be solved by the Fourier-Motzkin elimination (FME), see e.g. [49]. In each step of the FME, one variable is eliminated from the system, but new inequalities are added, until only one variable remains and its value can be expressed as an interval. Let's assume that the solution to the system above was found and has the following form, where L and U denote the lower and upper bounds for each weight w i : The set W 1 forms a polyhedron in an n-dimensional unit cube. For the comparisons with other alternatives with regard to Definition 1, instead of counting the number of generated cases for which Alternative 1 is the best, we have to find the volume of the set W 1 . This volume is given as an n-dimensional definite integral: However, the downside of the FME is that the number of inequalities grows doubly exponentially [50,51]. At most, one can expect to get 4ð k 4 Þ 2 nÀ 1 inequalities for one variable, where k is the input number of inequalities and n is the number of variables [51]. Hence, for instance, with originally 8 inequalities and 4 variables one may end up (in the worst case scenario) with a system of 1,024 inequalities that leads to the solution for only one alternative. . . Therefore, it is possible to use the analytic approach, but the computational complexity makes it rather infeasible in practice except for the cases with very low numbers of alternatives and criteria.

Conclusions
In this paper we introduced the notion of the Monte Carlo weights in the MCDM/MCDA framework. We showed that alternatives can be compared and ranked even when information on criteria weights is missing or unavailable, and that our approach enables to find (the most stable) weights such that a given alternative is ranked the best, and the evaluation of its stability by finding the minimal change of criteria weights necessary to a replacement at the top of the ranking. Thus, the Monte Carlo weights method provides a valuable insight into configuration of criteria weights and its influence on alternatives' ranking.
Further on, we introduced the notion of multi-dominance, which enables to narrow the set of alternatives under consideration, and we provided estimates for Monte Carlo sample size so a desired robustness of results can be achieved.
We believe the presented approach can be useful particularly in situations when criteria weights are uncertain or difficult (impossible) to acquire, which is, in particular, the case of newly or recently emerging problems with none or insufficient previous experience.
Our further research will focus on more general framework of the Monte Carlo weights method not limited to the problems incorporating the notion of a utility function.